So today I want to talk about security and cognitive radio networks.
Now, there are a lot of companies right now that are spending a lot of money on a prediction.
And the prediction is that over the next ten years, the number of connected devices
and the amount of data that they're going to use is going to increase very, very fast.
And so new protocols and new specifications and even new hardware are being developed
to deal with this engineering problem of how do we add billions more devices
to networks that are already really congested, especially wireless networks.
And a lot of the time when these new protocols are designed,
it's really important, obviously, to think about how we make sure data integrity is maintained.
And in the kind of networks I'm going to talk about today,
not a whole lot of thought has really gone into it yet.
And I think the way that security research kind of normally works is
we start out with a system that's deployed over a really large area,
and then someone discovers a problem with it, right?
And then we talk about it, and everyone freaks out,
and then it's up to the vendor or whoever is responsible
to try to resolve the problem.
And then we sort of repair it, and then everyone sort of relaxes a little bit
until it happens all over again.
And so we sort of have this cyclical band-aiding of almost everything that we interact with.
And I think we're at a really unique point in cognitive radio networks
because it's far enough along now that there are actual deployments in the field,
but it's not so far along that, if we discover problems with it
and make suggestions about how things should be changed,
it's already going to affect millions of people.
But to understand how we can make these improvements,
first we have to understand the system that we're talking about.
So initially we had radio, right?
Radio, you've got a transmitter and receiver,
and if you want to change how the radio operates,
then you have to turn a knob.
And you've got to be standing there physically to turn the knob,
and the knob only turns so far one way or another.
Okay, so then technology advanced a little bit further,
and we came up with software-defined radio.
Now we can turn those knobs in software,
and we can do it remotely,
and we can even control how far the knob turns one way or another.
Okay.
And a huge technological step from this
is cognitive radio,
which is adding a feedback loop into the radio itself.
So now the radio is capable
of observing its surroundings
and then changing itself,
changing its own parameters
to optimize whatever it's trying to do.
And we have a whole bunch of these together,
and we have, of course, a cognitive radio network.
So now not only are individual radios
working out for themselves
how they can
increase their performance,
but they can also
talk to each other and they can talk back to a central base station somewhere and inform
each other about what's going on. So another word for this might be a didactic network, where
they're actually teaching each other. Now, it's important that we understand how information
is actually sent in the physical layer, otherwise the rest will be a little difficult to
follow. To send information over a wave, we can fully define a wave by three
parameters: its frequency, which is how often a certain point reappears, its amplitude
and its phase. And so by fiddling with one or more of these parameters, we do what's
called modulation. So that's how we actually represent information in a wave. Now, there
are lots of different ways you can do modulation, of course. If we start with a simple cosine
wave like this, the top is the time domain and the bottom is the frequency domain. If
we frequency modulate this, then in the time domain we see the frequency is actually changing
with time.
And in the frequency domain, you can see all those frequencies. There are more complicated
ways to do this, too. For instance, this is quadrature amplitude modulation. It's more
difficult just by eye to see what's going on here. But the point is there are lots of
different ways you can do this. And each different form of modulation has different tradeoffs.
So by changing individual parameters in each different form of modulation, we can trade off
things like how fast it will go versus the bandwidth it will take up, stuff like that.
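
As a rough illustration of the two modulation schemes just mentioned, here is a minimal Python sketch. The function names, the Gray-coded level table and all the numbers are assumptions chosen for the example, not something taken from the talk. The first function varies the phase of a cosine carrier so that its instantaneous frequency follows a sine-wave message (frequency modulation); the second maps groups of four bits onto 16-QAM symbols, where two bits pick the in-phase amplitude and two bits pick the quadrature amplitude.

```python
import math

def fm_samples(carrier_hz, deviation_hz, message_hz, sample_rate, n):
    """Frequency-modulate a cosine carrier with a sine-wave message.

    The instantaneous frequency swings between carrier_hz - deviation_hz
    and carrier_hz + deviation_hz at a rate of message_hz.
    """
    beta = deviation_hz / message_hz  # modulation index
    out = []
    for k in range(n):
        t = k / sample_rate
        phase = (2 * math.pi * carrier_hz * t
                 + beta * math.sin(2 * math.pi * message_hz * t))
        out.append(math.cos(phase))
    return out

# Gray-coded amplitude levels for one axis of a 16-QAM constellation
# (an illustrative mapping; real standards each define their own).
LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def qam16_map(bits):
    """Map a list of bits (length divisible by 4) to complex 16-QAM symbols."""
    if len(bits) % 4 != 0:
        raise ValueError("16-QAM needs 4 bits per symbol")
    symbols = []
    for i in range(0, len(bits), 4):
        in_phase = LEVELS[(bits[i], bits[i + 1])]
        quadrature = LEVELS[(bits[i + 2], bits[i + 3])]
        symbols.append(complex(in_phase, quadrature))
    return symbols

wave = fm_samples(1000.0, 200.0, 50.0, 48000.0, 100)
symbols = qam16_map([1, 0, 1, 1, 0, 1, 0, 0])  # two symbols from eight bits
```

The tradeoff shows up directly in the second function: 16-QAM carries four bits in every symbol, but the sixteen points sit closer together than in a simpler scheme, so it needs a cleaner channel to be decoded reliably.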
So the cognitive engine is sort of like this. It's a little bit more complicated.
It's sort of the hyped-up word for what you might describe as the thing that actually
controls and commands individual cognitive nodes. So there are lots of different parameters
that the cognitive engine controls. And these are examples of some of them. Now, you'll
notice that these are all in different layers of abstraction. So some of them are all the
way down in the physical layer and some of them are higher. And some of them even depend
on each other. So the cognitive engine is trying to accomplish some task, like let's
say it's trying to minimize interference and it's trying to maximize data rate. So
we would call that its objective function. So it's got a bunch of inputs to its objective
function. These are examples of inputs that it would have. And by changing those inputs,
it's going to achieve some output of this function. And it's going to either try to
minimize or maximize this function. So if we were to plot that, this is a simple
three-dimensional example. Normally you would have way more than three dimensions, so it
would be impossible to visualize like this. But a simple technique that a cognitive engine
might use to try to optimize whatever it's trying to do is something like gradient descent,
or other simple machine learning techniques. And this works relatively well, but obviously
there are dangers of hitting a local max or min instead of the global one. And so there are several
other techniques that it can use. Another one is game theory. And this actually works
really well because spectrum is a resource that is being fought over by lots of different
people. And so we can model this in game theory really easily. And one of the ways
that we can do this is to try to achieve what's called
Pareto optimality, which means that when the network is at its Pareto optimum,
then there's nothing it can do, no change
it can make, that will increase the performance of one node without
also decreasing the performance of another node. If you've got a single cognitive radio network in a
room by itself and there's no one else there, this is sort of like the idealized
case where you win every time. Now, in real life, of course, this is never true. You're
always competing with other people and there's other things going on in the network and maybe
there's malicious users in the network. So in that case, you can aim
for a Nash equilibrium, which means that now instead of trying to win everything
all the time, you're basically trying to not lose all the time. In other words, as long
as everyone else keeps using the same strategy, there is no change that any one player can make
to their own strategy that will improve their own outcome. Now, again,
this gets difficult to actually achieve in real life and so you can try to approximate
it. But the point is there's a lot of really interesting ways that a cognitive engine could
try to optimize this objective function.
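
To make the two game-theory ideas concrete, here is a small Python sketch of a made-up two-radio power-control game. The payoff numbers are invented for illustration: each radio picks low or high transmit power, high power helps the radio that uses it and hurts the other one. A profile is a Nash equilibrium if no single player can do better by changing only their own choice, and it is Pareto optimal if no other profile makes someone better off without making someone else worse off.

```python
from itertools import product

STRATEGIES = ("low", "high")

# Hypothetical payoffs (radio 0, radio 1) for each pair of power choices.
PAYOFF = {
    ("low", "low"): (3, 3),
    ("low", "high"): (1, 4),
    ("high", "low"): (4, 1),
    ("high", "high"): (2, 2),
}

def is_nash(profile):
    """True if no player gains by unilaterally switching strategy."""
    for player in range(len(profile)):
        current = PAYOFF[profile][player]
        for alternative in STRATEGIES:
            deviated = list(profile)
            deviated[player] = alternative
            if PAYOFF[tuple(deviated)][player] > current:
                return False
    return True

def is_pareto_optimal(profile):
    """True if no other profile is at least as good for all and better for one."""
    mine = PAYOFF[profile]
    for other in PAYOFF.values():
        no_worse = all(o >= m for o, m in zip(other, mine))
        better = any(o > m for o, m in zip(other, mine))
        if no_worse and better:
            return False
    return True

profiles = list(product(STRATEGIES, repeat=2))
nash = [p for p in profiles if is_nash(p)]
pareto = [p for p in profiles if is_pareto_optimal(p)]
```

With these particular numbers the only Nash equilibrium is both radios shouting at high power, which is exactly the one outcome that is not Pareto optimal. That gap between "nobody can improve alone" and "nobody is being wasted" is the kind of thing a cognitive engine has to reason about.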
Back in the 1800s when wireless telegraph was first showing up, it was really, really
spectrally inefficient. And that was because it used spark gap transmitters, which just
trashed the spectrum. So when operators were trying to talk to each other, they would have
to listen to see if anyone was there. And if there wasn't, then they could go ahead
and start talking. And so, as you can imagine, this did not end up working very well at all.
And, in fact, one of the more common messages that people sent was
GTOOMQRT, which stood for go to hell, old man, I'm trying to transmit. And so they would
just yell at each other until someone gave way and they could actually transmit. And
this really became a big problem when the Titanic sank, believe it or not, because after
they investigated the sinking, they realized that some of the shore transmitters were actually
interfering with some of the ship rescue efforts. And so the Radio Act of
1912 was passed,
which created licensing, the system the FCC took over when it was created in 1934. So people realized that to make sure
no one is interfering with anyone else, we need to have licenses. Everyone is responsible
for their own chunk of spectrum and you're definitely not allowed to transmit where anyone
else is. And this worked well for a while. But eventually, as more and more wireless
devices came online and especially in, like, the 80s and 90s when cell phone companies
started becoming big and they started aggressively lobbying Congress to buy more and more spectrum,
the spectrum was divided up into smaller and smaller pieces.
And some rudimentary spectrum sharing began because as long as you're not obviously in
the same physical location or you're not operating at the same time, then you're not going to
interfere with each other.
Now, more recently, something really interesting has happened. Back in the 90s when people
started seeing this problem of, okay, maybe we should figure out how to ration spectrum
a little better, people began setting up spectrum observatories, which are basically just a
spectrum analyzer on a building somewhere with an antenna. And they would watch this really
wide band and
see what was actually going on. How are people really using their licenses? And they found
out that, surprisingly, it's actually full of holes. And a lot of the spectrum that
people are buying isn't actually used, or at least it's not used very efficiently.
So recently, several years ago, the FCC, after being petitioned by Google, decided
to make unlicensed the old analog TV channels. So this is called TV white space. And Google
was kind of the primary push of that.
And so the problem is that,
because the licenses are not the same everywhere, stuff like white space
isn't available everywhere. So this is a plot, for example, of one channel of TV white space
across the United States. And all the blue is where it's available and the green is where
it's not available. So as you change channels, the availability of different frequencies
changes as well. So this is the entire United States. It's a little deceiving on this plot
because it's not plotting density, it's plotting different channels in different colors. But
you can see it's mostly around a couple big cities where there's not a lot of spectrum
available.
Everywhere else, there's no TV channels out there. So why not use it?
And there are several companies now that have begun really taking advantage of this. The
FCC has set up specific databases with Microsoft and a company called Spectrum Bridge and Google.
And the way it works is you query these databases and you tell them where you are and they'll
return to you a list of the frequencies that you're allowed to use without paying for them.
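
The lookup idea can be sketched in a few lines of Python. This is only a toy in-memory model of the concept, not the interface of any real white space database: the table entries, the channel range and the function names are all made up for illustration. Each entry is a protected transmitter with a location and a radius, and a channel is considered available at your location if you are outside the protected radius of every transmitter on that channel.

```python
import math

# Hypothetical protected transmitters:
# (TV channel, latitude, longitude, protected radius in km)
PROTECTED = [
    (21, 33.75, -84.39, 100.0),
    (27, 33.75, -84.39, 80.0),
    (21, 36.16, -86.78, 90.0),
]
ALL_CHANNELS = range(21, 30)  # illustrative slice of the UHF TV band

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def available_channels(lat, lon):
    """Return the channels with no protected transmitter covering (lat, lon)."""
    blocked = {
        channel
        for channel, tx_lat, tx_lon, radius in PROTECTED
        if distance_km(lat, lon, tx_lat, tx_lon) <= radius
    }
    return [channel for channel in ALL_CHANNELS if channel not in blocked]

near_transmitters = available_channels(33.75, -84.39)
far_from_everything = available_channels(45.0, -110.0)
```

Notice that the answer depends entirely on the coordinates the device reports and on the contents of the table, which is worth keeping in mind for the attacks discussed later.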
And these companies, Microsoft and Google, have also
begun using this in what some people are calling super Wi‑Fi, which is basically
just taking 802.11 and shifting it down to these frequencies, which are
in UHF, so around 500 to 600 megahertz. And they're doing lots of trials right now.
There's around 40 experimental installations in the United States. They've also got trials
going all over the world: Kenya, South Africa, Tanzania, Senegal, Singapore, everywhere.
And so this is really interesting. But there's other uses for this as well. There's a company
in France called Sigfox that rather than use this as just another way to do Wi‑Fi is
trying to use this for long‑range wireless sensors that are specifically really low power.
And so rather than connect people on a traditional network, they're connecting, for instance,
farmers who need to measure, you know, the moisture of their field and you've got a really
large area. This is the kind of stuff that they're working on. So especially for these
low‑power devices, we need new protocols. We need new ways to deal with these interesting
physical and political properties. So there's another company in England called
Neul that is developing a protocol called Weightless. And Weightless is kind of interesting
because it's set up as a special interest group, but it's set up as a private special
interest group. Now, Bluetooth did the same thing. So that means that if you want to contribute
to the spec, you have to pay a bunch of money. And in the case of Bluetooth, the way it worked
was you paid a bunch of money to contribute and then afterwards, once they released the
spec, you could download the entire spec for free.
Weightless, for some reason, is working a little differently. And if you want to just read
the spec, which they've now released as version 1.0, it costs almost $1,000. And they claim
that it's an open spec. So I'm not really sure how that works, but I hope that they
perhaps take a turn because I'm sure there'll be a lot of people interested in how this
actually works and poking around at it. So now briefly, I want to talk about some
of the kinds of attacks that specifically apply to cognitive radio networks. Now, a
lot of traditional
network attacks will also work on cognitive radio networks, but a lot of times they will
work in different ways. So obviously I'm not going to enumerate every single kind of attack
here but I want to give you an idea of the kinds of things you have to think about when
dealing with networks like this because it takes a little bit of different thinking.
So I'm sure one attack everyone in here is familiar with is a replay attack, right? You
take some traffic off the network and then you store it and then you play it back at
a later time. So on a regular network, if you don't handle that correctly, that can
be bad. But on a cognitive radio network, it can mean different things because your
cognitive engine is constantly monitoring the network and trying
to decide how it can better optimize it, what improvements it can make and how it can make
sure it's not interfering with everyone else. So if it sees traffic returning to it that
it has already seen, then it may assume one of two things. It may assume that there's
a routing problem, especially if it's an ad hoc kind of network, or it may assume there's
some weird RF thing happening, perhaps if it's seeing a large reflection off of a surface
or something.
And it may try to adjust for it. So taking advantage of the assumptions that the cognitive
engine makes is, I think, a large attack surface. One of the more maybe obvious methods that
you might think of when attacking these kinds of networks is changing the observations
that individual nodes can see. So say a legitimate node observes an incumbent on some channel.
There's several different ways it can observe an incumbent. It can use
something as simple as energy thresholding.
So if there's some power above some threshold, it decides that there's a person there. Or
it can use more complicated ways, for instance, cyclostationary analysis or wavelet analysis.
It can actually try to characterize the signal more. I'm not going to go into exactly how
those work, because the math is really hairy, but in any case, it discovers that there's
a person there, and will forward this message along the network until it hits perhaps a
compromised node, in which case the message can be changed. And once this is forwarded
along to the base
station, this can cause different decisions to be made. And this can do one of two things.
First of all, it means that the real incumbent is now going to be ignored and you can effectively
turn the entire network into your own jammer. And the other advantage of this particular
attack is that you don't have to transmit anything to make it work. So rather than having
to set up your own radio and potentially be triangulated or something, you can simply
change traffic to change what appear to be observations.
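
Here is a minimal Python sketch of that attack, under some loudly stated assumptions: the sample values, the threshold, the node names and the "occupied if any node says so" fusion rule are all invented for the example. Each node does simple energy thresholding on its samples, the base station combines the reports, and a compromised relay rewrites one report on the way through without ever transmitting anything itself.

```python
def energy_detect(samples, threshold):
    """Return True if the average power of the samples exceeds the threshold."""
    power = sum(abs(s) ** 2 for s in samples) / len(samples)
    return power > threshold

def fuse_or(reports):
    """Base station rule (assumed): occupied if any node reports an incumbent."""
    return any(reports.values())

def tamper(reports, node_id):
    """What a compromised relay might do: flip one node's report to 'free'."""
    forged = dict(reports)
    forged[node_id] = False
    return forged

THRESHOLD = 0.1                       # illustrative power threshold
noise = [0.05, -0.04, 0.03, -0.06]    # samples with only background noise
incumbent = [0.9, -1.1, 1.0, -0.8]    # samples with a strong signal present

reports = {
    "node_a": energy_detect(noise, THRESHOLD),
    "node_b": energy_detect(incumbent, THRESHOLD),  # only node_b hears it
    "node_c": energy_detect(noise, THRESHOLD),
}

honest_decision = fuse_or(reports)                    # channel is occupied
forged_decision = fuse_or(tamper(reports, "node_b"))  # channel looks free
```

Because only one node was in a position to hear the incumbent, changing that single report is enough to make the whole network believe the channel is free and start transmitting on top of it.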
A simpler version of this would be routing disruption. Again, another attack that is
well documented in traditional networks. But if a node either starts dropping packets or
completely drops off the network, then this can be really bad for the cognitive engine
because if that particular area physically where the node is located is collecting really
valuable data, then it's now blind in that part of the network. And so that can drastically
change how the entire network is going to work.
By the way, you don't need some kind of complicated exploit to make a small node, especially
the kinds that are typically used in sensor networks, act like a black hole. A baseball
bat will also work to take the node off the network.
The Sybil attack is another one, originally described for peer‑to‑peer networks, the idea being
that if you've got a trust relationship between individual nodes or the base station, then
you can take advantage of that by either taking over existing nodes
or adding more of your own to the network. So, especially in these cases, it's really important to know
who you can trust information from and when you can trust that it's
real. And so keeping track of individual nodes and whether or not they're being suspicious
is really important. So if you get enough of your own nodes on the network, then you
now have basically a voting majority and your compromised nodes can vouch
for each other. And in this way you can indirectly control the
decisions that the network is going to make, because you can just feed it whatever you
want it to hear, and then you can get a pretty good idea of what it's going to
do in response. A priority attack is another interesting
attack that is perhaps unique to this kind of network. The idea is that you've got sensors
in different places, like let's say you've got a sensor in a laboratory that's measuring
the moisture of a fern or something. And then you've got another sensor in the same
network that is measuring toxic fume levels in the lab. Okay, well, clearly the one that's
measuring fume levels should be much higher priority than the one measuring moisture in
the plant. And so by telling the cognitive engine that you're a
higher priority than you really are, you can divert resources away from places
that really need them. Because especially in these cases where you're sharing spectrum,
there's a finite amount of resources to go around. And so it's easier to sort of
clamp those off. Whenever people are designing hardware, especially for
these kinds of networks, it's really, really easy, especially when you're designing small
nodes, to rationalize weak crypto. And the reason is that if you're, you know,
working on a microcontroller and you're writing in assembly and you're trying to squeeze every
cycle out, it can be really easy to say, you know, who's really
going to try to, you know, break into this? Or it doesn't really matter if someone is
able to read this traffic. And a lot of it also comes down to speed versus security,
right? Because in a network like this, you've got a lot more network overhead. And so understanding
what tradeoff to make is really hard, but it's really, really important. And it's
easy to screw up. Data privacy on a normal network is obviously important, but in these
kinds of networks, the data gives away more information about the nodes themselves than perhaps on
a regular network. For instance, location. So if you have a network that has a lot of
nodes, it can be much easier to discover the location of an individual node because the spectrum
that it's observing is really, really specific to where it physically is. And individual
trees and buildings and stuff around specific nodes can drastically affect the spectrum
that they're observing, and you can characterize that. And so you can figure out physically
where they are. And this can be really bad if you're trying
to keep that secure. So this, I think, primary user emulation is
a really challenging attack that we're going to have to deal with in these kinds of networks.
First I guess I should explain what a primary user
is in the context of the FCC. For instance, a TV
transmitter who actually owns the license would be the primary user. And then everyone
who's sharing the spectrum with them is a secondary user. So normally the way this works
is you look up the database, and that tells you all the primary users, and you know, okay,
well, if they're not on this list, then they must not be a primary user. So primary user emulation is kind
of an exploit of
both technology and policy. Because the problem is, as the law stands right now,
there's no way to authenticate a primary user. So if you set up a radio somewhere and you
start rebroadcasting episodes of Happy Days in the middle of a network, then you can potentially
DoS the entire network off the air. And even though the network may very, very well suspect
that you're doing something bad, there's nothing legally they can do about it because they
have to get out of your way. And so figuring out exactly how to deal with this has been
really tricky. And there's several papers that have been
written on special cases for this. But no one has really figured out how to deal with
it. And I think it's going to require a combination of some really clever algorithms for characterizing
real primary users versus fake ones, and also some policy change on how we're able
to detect them and what you're able to do once you discover that they're there. Because
at this point, once they DoS you off the entire network, then essentially it becomes
a jamming problem, and you can try to use something like spread spectrum to get around
that, but it's not ideal.
So those were obviously not every possible attack. Those are just general ideas. And
similarly, these are general ideas on countermeasures for how to deal with some of these things.
And not all of this has been enumerated yet, but I think there are several key
ideas that we're going to have to think about when we're trying to solve these issues.
The first is using cooperative intrusion detection. Traditionally, you know, we see maybe a single
intrusion detection system, but in this case you've got a bunch of nodes that are
all talking to
each other. Because they can inform each other, they should be able to inform each other about
each other. So not only are they observing the spectrum in general, but they should be
observing each other's behavior and keeping each other accountable. If they observe strange
traffic, then they need to alert each other and adjust their trust functions accordingly.
Device reputation is another really important thing. By keeping track of the quality of
the spectrum that each individual node is receiving, again, as well as their traffic,
then you can build this trust function and you can sort of weight your decisions based
on that. And this is not only something malicious, right? If there's something physically wrong
with a node and it starts reading weird spectrum, then that's another legitimate reason why
you need to know, okay, we should factor these observations less into our decisions.
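
One way to picture a trust function is the following Python sketch. Everything in it is an assumption made for illustration: the starting trust values, the gain and penalty, and the rule of weighting each report by the reporter's trust. It compares a plain one-node-one-vote majority with a trust-weighted vote when four freshly joined nodes all claim the channel is free and three long-established nodes say it is occupied.

```python
def majority_decision(reports):
    """Plain vote: occupied if more than half of the nodes say so."""
    return sum(reports.values()) / len(reports) > 0.5

def weighted_decision(reports, trust):
    """Trust-weighted vote: each report counts in proportion to its trust."""
    total = sum(trust[node] for node in reports)
    occupied = sum(trust[node] for node, seen in reports.items() if seen)
    return total > 0 and occupied / total > 0.5

def update_trust(trust_value, agreed, gain=0.1, penalty=0.3):
    """Raise trust slowly on agreement, cut it quickly on disagreement."""
    changed = trust_value + gain if agreed else trust_value - penalty
    return max(0.0, min(1.0, changed))

# Three established nodes see an incumbent; four new nodes deny it.
reports = {"a": True, "b": True, "c": True,
           "s1": False, "s2": False, "s3": False, "s4": False}
trust = {"a": 0.9, "b": 0.9, "c": 0.9,
         "s1": 0.1, "s2": 0.1, "s3": 0.1, "s4": 0.1}

plain = majority_decision(reports)           # the new nodes win the vote
weighted = weighted_decision(reports, trust)  # the established nodes win

# After the decision, nodes that disagreed with it lose trust.
trust = {node: update_trust(trust[node], reports[node] == weighted)
         for node in reports}
```

The asymmetric gain and penalty are a design choice: trust is meant to be slow to earn and quick to lose, so a node that is broken or malicious stops influencing decisions quickly, while a pile of new nodes cannot buy a majority just by showing up.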
And device location, again, is another important aspect for this. Because
physical security on nodes like this is a big deal. And having physical
access to them, even a single node, can significantly affect the network and change the decisions
that are being made. So why does this matter? Why should we try to work on some of these
problems? Well, this plot right here has probably been seen by, I would guess, every major networking
executive in the entire world. And this is showing, of course, the mobile data prediction
over the next couple of years. And Cisco is predicting insane numbers. They're saying
that by 2020 there will be 50 billion devices connected together on the network. Right now
there's about 10 billion. So there are a lot of these companies that are both predicting and kind
of freaking out and preparing for what they think is going to be this really big deal.
So it almost doesn't matter whether or not this actually happens because it's sort
of a self‑fulfilling prophecy. You know, they're predicting that it's going to happen and
preparing for it. So I think that this will be relevant either way.
But this is the spectrum map of the United States right now. And you can see how fragmented
it is. This goes from 3 kilohertz to 300 gigahertz. So this is the current solution.
It's chopping it up into smaller pieces. And obviously we can't keep doing this. And eventually
we're going to have to figure out how to deal with that. So another application,
I think, of cognitive radio specifically is cell phone towers. As the density of cell
phone towers increases and as we see more and more people using cell phone towers, there's
going to be a proliferation of femtocells. I mean, the transmitters are getting
closer and closer together and so they have to make sure they're not interfering with
each other. If you've got a femtocell in every house, then you're going to have problems.
And some of these are already beginning to have some very, very simple cognitive aspects
to them where they'll try to avoid each other. And I only see that continuing
to increase because it's going to let them be even more efficient.
So to sort of do experiments with this and to play around with this, we need to start
with tools. And if you've ever done any work in RF before, then probably the first thing
that comes to your mind is the USRP, which is this really neat little software-defined
radio. The only downside is that it's a little expensive. I've actually written some experimental
cognitive engine code, some base station code in GNU Radio that will run on the USRP and
I'll link to it at the end if you want to play with it. So this is good for acting like
sort of the base station that will make decisions and command smaller nodes.
However, if you're trying to do experiments with a network, then you typically need multiple
nodes and maybe you can afford one USRP, but you probably can't afford like five of them.
So the other end of this is the really, really cheap, you know, XBee‑type thing, which is just
a little wireless module where you kind of put data in one side and it magically comes out
the other side. And the good thing is that it's really, really cheap, so you can buy
a bunch of them. But the problem is they're not frequency agile at all and they're not
very customizable. You can't control them very well.
So I kind of wanted something in between. And there wasn't really anything at the
time, so I built something. I built this board that I called Level and it goes from
30 megahertz to 4.4 gigahertz and outputs about 60 milliwatts. It uses a chip by TI that's
based on the MSP430, which I'll talk about in a second. So it's compatible with TI's
really cool off‑the‑shelf mesh networking stack called SimpliciTI. And it fits onto
Arduino shields.
It isn't an Arduino shield, which I'll clarify in a second. And they cost
about 100 bucks. So this is what it looks like. And I'll briefly go over some of the
topology here. This is the CC430, which is a microcontroller, like I said, by Texas Instruments.
It's got an MSP430 core in it as well as a CC1101 transceiver core in it. It's low power
and it's relatively low bandwidth as well, so it's good for doing low power sensor kind
of stuff.
The local oscillator is this part by Analog Devices. This is an ADF4351 wideband VCO.
Those are mixed together in this ADEX-10L. This is a passive mixer. And because it uses
a single antenna, it's got two RF switches that are controlled by GPIO on the MSP430,
so you can switch from transmit to receive mode. And then it runs through a bunch of
filters and some amplifiers.
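
The frequency arithmetic behind a mixer stage like this can be sketched in Python. This is a simplified model, not the board's actual firmware or frequency plan: it assumes an ideal mixer whose output contains only the sum and the difference of the transceiver frequency and the local oscillator frequency, the 433 megahertz transceiver frequency is an illustrative choice, and the default tuning limits are assumed values for the oscillator. Given a target output frequency, the function lists which oscillator settings would land a mixing product on it, along with the unwanted product that the filters would then have to remove.

```python
def lo_candidates(target_mhz, if_mhz, lo_min_mhz=35.0, lo_max_mhz=4400.0):
    """List (LO frequency, unwanted products) pairs that hit target_mhz.

    An ideal mixer fed with if_mhz and an LO produces LO + IF and |LO - IF|.
    The LO limits are assumed defaults for this sketch.
    """
    plans = []
    options = {target_mhz - if_mhz, target_mhz + if_mhz, if_mhz - target_mhz}
    for lo in options:
        if lo_min_mhz <= lo <= lo_max_mhz:
            products = {lo + if_mhz, abs(lo - if_mhz)}
            unwanted = sorted(products - {target_mhz})
            plans.append((lo, unwanted))
    return sorted(plans)

# Reaching 550 MHz (TV white space territory) from a 433 MHz transceiver.
plans = lo_candidates(550.0, 433.0)
```

For this example there are two workable oscillator settings, 117 and 983 megahertz, and each one leaves a different unwanted product (316 or 1416 megahertz). Which one is preferable depends on where the filters on the board can do the most good.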
And I also added these two things. So you can see here, this is the single antenna.
And then you can see these two things, which are optionally populated. These are directional
couplers. And these basically let you tap into the RF signal that's coming out of the
MSP430 directly and the ADF4351 directly without going through the mixer and filters
and amplifiers and everything. So it helps with debugging.
This is what I meant by it fits onto Arduino shields. As I was building this, I thought
it would be pretty cool if you could actually interact
with other devices. For instance, your laptop obviously can do Wi‑Fi but it can't do stuff
at 500 megahertz. I was working with TV white space and I wanted to play around with that.
So I realized that Arduino shields typically have similar SPI pinouts for a lot of these
break‑out boards for pretty much everything. And so it fits right on there. And once you've
de‑packetized whatever you're receiving on the top board, then you can just send it
over serial to an Arduino shield which will typically do all the hard parts for you and
turn it into 802.11 or whatever you want. So this actually is on a Wi‑Fi shield and
this is on an Ethernet shield. So this board, by the way, I would still
consider kind of a prototype. And I don't really have a way to mass manufacture them
right now. However, code and firmware and everything and schematics are all on GitHub
and I'll link that at the end.
And if there's enough interest, then we can see what we can do.
There are other tools out there, too, that are really good for this stuff. The HackRF
by Michael Ossmann, which just launched on Kickstarter a couple days ago, is a pretty neat
tool. The bladeRF, which also was on Kickstarter earlier this year. And then there's another
board that I found out about very recently called the MyriadRF, which is this pretty
neat little board. It's not quite as frequency‑adjustable as the other boards.
But it's really neat. And so all three of these are really good tools for playing with
this. And all three of them didn't exist when I was originally designing my board, which
is why I didn't use any of them.
So what's next? Well, the whole spectrum crunch thing, depending on who you talk to,
some people would say that, you know, it's imminent and we're all doomed and some people
will say, well, you know, maybe we have a little more time than we thought. You
know, there's new techniques people are using that might buy us more time.
But here's what we know. We know that a lot of these companies that have a whole lot of
money are investing a lot of money in cognitive radio networks. They've been doing experiments
in turning entire cell phone towers into cognitive nodes. And I think that we're at a really
unique time because, like I said, these are deployed to the point where there's actually
real networks in the field right now. I mean, in France with Sigfox, they've got at this
point apparently thousands of devices connected to paying customers. And there's dozens and
dozens of installations in the United States right now. Actually, West Virginia University
just a couple weeks ago started serving Wi‑Fi to some of their dorms over TV white space.
And so we're at this really cool time where
the networks actually exist, but they're not used by so many millions of people that
it's too late to really change fundamentally how they work.
And so I think by attacking these kinds of problems and by trying to solve these, I mean,
really nontrivial issues, whether technological or political, we can actually solve these
problems. We can really be on our way towards making sort of the next generation network
and making sure that we're able to deal with whatever the results of these predictions
end up being. Thank you.
